home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Software Vault: The Gold Collection
/
Software Vault - The Gold Collection (American Databankers) (1993).ISO
/
cdr49
/
303_01.zip
/
DOC
< prev
next >
Wrap
Text File
|
1993-04-01
|
9KB
|
265 lines
A Disassembler
1. Introduction
This document describes the first release of a disassembler
for UNIX executable files. The key features are:
1. For object files the output can be assembled to
generate the same object module, (apart from minor
variations in symbol table ordering) as the input.
2. For stripped executable files object modules and
libraries may be scanned, modules in the main input
identified and the appropriate names automatically
inserted into the output.
3. An option is available to convert most non-global
names into local symbols, which cuts down the symbols
in the generated assembler file.
4. The disassembler copes reasonably with modules merged
with the -r option to ld, generating a warning message
as to the number of modules involved.
At present this is available for certain Motorola 68000
ports of UNIX System III and System V. Dependencies on
a. Instruction set.
b. Object module format.
c. Library module format.
d. Assembler output format.
are hopefully sufficiently localised to make the product
useful as a basis for other disassemblers for other versions
of UNIX.
The product is thus distributed in source form at present.
2. Use
The disassembler is run by entering:
unc mainfile lib1 lib2 ...
The first named file is the file to be disassembled, which
should be a single file, either an object module, a
(possibly stripped) executable file, or a library member.
Library members are designated using a parenthesis notation,
thus:
Issue 1I1 - Page 1 - 1G1
A Disassembler
unc '/lib/libc.a(printf.o)'
It is usually necessary to escape the arguments in this case
to prevent misinterpretation by the shell. Libraries in
standard places such as /lib and /usr/lib may be specified
in the same way as to ld, thus
unc '-lc(printf.o)'
unc '-lcurses(wmove.o)'
As an additional facility, the list of directories searched
for libraries may be varied by setting the environment
variable LDPATH, which is interpreted similarly to the shell
PATH variable, and of course defaults to
LDPATH=/lib:/usr/lib
As a further facility, the insertion of lib before and .a
after the argument may be suppressed by using a capital -L
argument, thus to print out the assembler for /lib/crt0.o,
then the command
unc -Lcrt0.o
should have the desired effect.
Second and subsequent file arguments are only referenced for
stripped executable files, and may consist of single object
files and library members, using the same syntax as before,
or whole libraries of object files, thus:
unc strippedfile -Lcrt0.o -lcurses -ltermcap '-lm(sqrt.o)' -lc
It is advisable to make some effort to put the libraries to
be searched in the order in which they were originally
loaded. This is because the search for each module starts
where the previously matched module ended. However, no harm
is done if this rule is not adhered to apart from increased
execution time except in the rare cases where the
disassembler is confused by object modules which are very
nearly similar.
3. Additions
The following options are available to modify the behaviour
of the disassembler.
-o file Causes output to be sent to the specified
file instead of the standard output.
Issue 2I2 - Page 2 - 2G2
A Disassembler
-t prefix Causes temporary files to be created with the
given prefix. The default prefix is split,
thus causing two temporary files to be
created with this prefix in the current
directory. If it is desired, for example, to
create the files as /tmp/xx*, then the
argument -t /tmp/xx should be given. Note
that the temporary files may be very large as
a complete map of the text and data segments
is generated.
-a Suppresses the generation of non-global
absolute symbols from the output. This saves
output from C compilations without any
obvious problems, but the symbols are by
default included in the name of producing as
nearly identical output as possible to the
original source.
-s Causes an additional scan to take place where
all possible labels are replaced by local
symbols. The local symbols are inserted in
strictly ascending order, starting at 1.
-v Causes a blow-by-blow account of activities
to be output on the standard error.
-V Causes shlib symbol values to be printed at
the top of the output, if they exist in the
symbol table. normally these are ommited
because they are constant values derived form
the file /lib/shlib.ifile
4. Diagnostics etc
Truncated or garbled object and library files usually cause
processing to stop with an explanatory message.
The only other kinds of message are some passing warnings
concerning obscure constructs not handled, such as the
relocation of byte fields, or the relocation of overlapping
fields. Occasionally a message
Library clash: message
may appear and processing cease. This message is found where
at a late stage in processing libraries, the program
discovers that due to the extreme similarity of two or more
library members, it has come to the wrong conclusion about
which one to use. The remedy here is to spell out to the
Issue 3I3 - Page 3 - 3G3
A Disassembler
program which members to take in which order.
5. Future development
In the future it is hoped to devise ways of making the
disassembler independent of all the above-mentioned version
dependencies, by first reading a files defining these
things. This will probably be applied after the Common
Object Format becomes more standard.
In the long term it would be desirable and useful to enhance
the product to produce compilable C in addition to
assemblable assembler. Stages in the process are seen as
follows:
1. Better identification of basic blocks in the code.
Switch statements are a major problem here, as are
constant data held in the text segment.
2. Marrying of data to the corresponding text. It is in
various places hard to divorce static references "on
the fly" (e.g. strings, and switch lists in some
implementations) from static at the head of a module.
This is part of the problem of identifying basic
blocks.
3. Compilation of header files to work out structure
references within the text. At this stage some
interaction may be needed.
Meanwhile the product is one which is a useful tool to the
author in its present form. Comments and suggestions as to
the most practical method of improving the product in the
ways suggested or in other ways would be gratefully
considered.
Issue 4I4 - Page 4 - 4G4